Four Types of Noise in Data for PAC Learning
Author
Abstract
In order to be useful in practice, machine learning algorithms must tolerate noisy inputs. In this paper we compare and contrast the effects of four different types of noise on learning in Valiant's PAC (probably approximately correct), or distribution-free, model of learning [11]. Two previously studied models, malicious noise [12] and random classification noise [1], represent the extremes. Malicious noise is intended to model the worst possible sort of noise, and in general only a very small amount of it can be tolerated [7]. On the other hand, Angluin and Laird [1] have shown that for random misclassification noise, where instances are never altered but their labels are reversed with probability ν, PAC learning can be achieved for any ν < 1/2. They further show that any algorithm that chooses as its output concept some concept that minimizes disagreements with a polynomial-size set of examples meets this bound. Here we extend Angluin and Laird's result to malicious misclassification noise, where the noisy label is chosen adversarially instead of randomly. We also show that if one considers only algorithms that work by min...
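To make the disagreement-minimization strategy concrete, the following short Python sketch flips each label independently with probability ν and then outputs whichever hypothesis disagrees with the fewest noisy examples. It is only an illustration of the idea under simplifying assumptions: the finite class of threshold functions on [0, 1], the sample size, and the function names (noisy_sample, disagreements, minimize_disagreements) are ours, not constructs from the paper.

import random

def noisy_sample(target_threshold, noise_rate, m, rng):
    """Draw m examples x ~ U[0, 1], label them with the target threshold,
    then flip each label independently with probability noise_rate
    (random classification noise)."""
    sample = []
    for _ in range(m):
        x = rng.random()
        label = 1 if x >= target_threshold else 0
        if rng.random() < noise_rate:
            label = 1 - label
        sample.append((x, label))
    return sample

def disagreements(threshold, sample):
    """Count examples whose (possibly flipped) label disagrees with the
    hypothesis 'x >= threshold'."""
    return sum(1 for x, y in sample if (1 if x >= threshold else 0) != y)

def minimize_disagreements(hypotheses, sample):
    """Return a hypothesis with the fewest disagreements on the sample;
    by Angluin and Laird's result this suffices for PAC learning whenever
    the noise rate is below 1/2."""
    return min(hypotheses, key=lambda h: disagreements(h, sample))

if __name__ == "__main__":
    rng = random.Random(0)
    hypotheses = [i / 100 for i in range(101)]  # finite class of 101 thresholds
    target = 0.37
    sample = noisy_sample(target, noise_rate=0.2, m=2000, rng=rng)
    print("learned threshold:", minimize_disagreements(hypotheses, sample))

With a noise rate well below 1/2 and a polynomial-size sample, the learned threshold is close to the target with high probability, which is the behaviour Angluin and Laird's bound guarantees for disagreement-minimizing learners.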
Similar resources
Taguchi Modeling for Techno-Economical Evaluation of Cr+6 Removal by Electrocoagulation Process With the Aid of Two Coagulants
The research aimed to apply the Taguchi method for techno-economical evaluation of Cr+6 removal using the electro-coagulation process with the aid of two different coagulants (FeCl3 and PAC). A Taguchi orthogonal array L27 (3^13) was applied for analyzing the effect of four variables, including initial pH, reaction time, current density and coagulant type, in an attempt to improve the chromium remo...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal is significantly reduced in the presence of environmental noise, which degrades the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
An Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...
Statistical Query Learning (1993; Kearns)
The problem deals with learning {−1, +1}-valued functions from random labeled examples in the presence of random noise in the labels. In the random classification noise model of Angluin and Laird [1], the label of each example given to the learning algorithm is flipped randomly and independently with some fixed probability η called the noise rate. The model is an extension of Valiant's PAC m...
General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Bounding
We derive general bounds on the complexity of learning in the Statistical Query model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the Statistical Query model. This new model was introduced by Kearns [12] to provide a general framework for efficient PAC learning in the presence of class...
Journal: Inf. Process. Lett.
Volume: 54, Issue: -
Pages: -
Publication date: 1995